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Modeling blood flow in larger vessels using lattice-Boltzmann methods comes with a challenging 
set of constraints: a complex geometry with walls and inlet/outlets at arbitrary orientations with 
respect to the lattice, intermediate Reynolds number, and unsteady flow. Simple bounce-back is 
one of the most commonly used, simplest, and most computationally efficient boundary conditions, 
but many others have been proposed. We implement three other methods applicable to complex ge- 
ometries (Guo-Zheng-Shi, Bouzdi-Firdaouss-Lallemand, Junk- Yang) in our open-source application 
HemeLB. We use these to simulation Poiseuille and Womersley flow in a cylindrical pipe with an 
arbitrary orientation at physiologically relevant Reynolds (1-100) and Womersley (4-12) numbers 
and compare the accuracy to analytical solutions. We find that the Bouzidi-Firdaouss-Lallemand 
method offers the best accuracy and stability properties with first-order convergence in space. Sim- 
ple bounce-back has similar behavior but with errors around 50% larger. The Guo-Zheng-Shi and 
Junk- Yang methods, while accurate at low Reynolds number, are unstable at Reynolds numbers 
> 30 and so cannot be recommended for use in hemodynamic simulation of larger arteries. The 
choice of collision operator (lattice Bhatnagar-Gross-Krook vs. multiple relaxation time) and velocity 
set (D3Q15 vs. D3Q19) does not significantly affect the accuracy in the problems studied. 

PACS numbers: 02.70.-c ; 47.11.-j ; 47.63. Cb 



I. INTRODUCTION 

In the last two decades, lattice-Boltzmann methods 
(LBM) [TH3] have been widely studied and used for fluid 
flow problems. We are particularly interested in its appli- 
cation to the study of hemodynamics: the flow behavior 
of blood under physiological conditions. Accurate sim- 
ulation of the flow of blood in an individual could have 
near-term clinical benefits for, inter alia, the treatment 
of aneurysms or stenoses [7HS]- 

Applications such as these have a number of chal- 
lenges: retaining computational performance in a rela- 
tively sparse, three-dimensional (3D) fluid domain; cap- 
turing the complex rheology of a dense suspension of 
dcformable particles; accounting for the compliance of 
vessel walls, and choosing a boundary condition method 
suitable for stable and accurate simulation. Indirect ad- 
dressing [Tni [H] offers a solution to the sparsity problem 
by precomputing the addresses of neighboring points in- 
stead of requiring that the points be stored in a dense, 
3D array. We will not examine the rheology and com- 
pliance problems in this article, restricting ourselves to a 
Newtonian fluid within a rigid-walled system, but instead 
aim to provide recommendations for the choice of bound- 
ary condition mcthod(s) for a complex geometry. This 
choice is examined in the context of two other variables: 
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the discrete velocity set and lattice-Boltzmann collision 
operator. Bearing in mind recent controversy over the re- 
producibility of computational science [T2J EI] , we have 
released the source code of our application HemeLB for 
the community's scrutiny and use. 

Latt et al. [14] considered five boundary conditions 
and assess their accuracy. However, they studied only 
boundary conditions in which the wall passes through 
a lattice point, immediately restricting their results to 
boundaries that are normal to one of the Cartesian di- 
rections of the underlying grid. Boyd et al. [15] compared 
simple bounce-back to the Guo-Zheng-Shi method [TB] in 
simulations of arterial bifurcations, finding that the two 
methods give different results when stenosis is present, 
but without any analysis suggesting which, if either, is 
more accurate. Stahl et al. [T7] performed LBM simu- 
lations with simple bounce-back boundary conditions to 
examine the effect of staircased boundaries (where walls 
that are not aligned with the underlying grid are approx- 
imated by a set of grid edges) on the measurements of 
shear stress both at the wall and in the bulk flow, finding 
errors in the shear stress of up to 35% for two-dimensional 
(2D) channel flow and 28% for 2D bent-pipe flow. They 
also introduced a method for estimating the local wall 
normal from the shear stress tensor. 

In this paper, we briefly introduce the lattice- 
Boltzmann method in Sec. [H] and discuss the collision 
operators, velocity sets and boundary conditions used. 
Our open-source software |18j and simulation protocols 
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are described, and results given in Sec. Ill We present 
our conclusions in Sec. IIVI 



II. THE LATTICE-BOLTZMANN METHOD 



a local equilibrium (see below) , in a discrete velocity ana- 
logue of the Boltzmann-BGK approximation from kinetic 
theory [2"2] . 



fi(/<) 
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Here we provide a brief summary of the lattice- 
Boltzmann method (LBM); for a full derivation, we re- 
fer the reader to one of the many thorough descriptions 
available [IHSj ■ LBM operates at a mesoscopic level, stor- 
ing a discrete-velocity approximation to the one-particle 
distribution functions of the Boltzmann equation of ki- 
netic theory [T!5], {/i(r, t)}, on a regular lattice, with grid 
spacing Ax. The set of velocities {c^} is chosen such that 
the distances travelled in one timestep (At), = CjAt, 
are lattice vectors and to ensure Galilean invariance [20] . 
When one only wishes to reproduce Navier-Stokes dy- 
namics, the set is typically a subset of the Moore neigh- 
borhood, including the rest vector. For 3D simulations, 
the most commonly used sets have 15 and 19 members 
(termed D3Q15 and D3Q19, respectively). The LBM 
can be shown, through a Chapman-Enskog expansion, 
to reproduce the Navier-Stokes equations in the quasi- 
incompressible limit with errors proportional to the Mach 
number squared. 

In the absence of forces, the density p(r, t) and the 
velocity u(r, t) at a fluid site can be calculated from the 
distributions by 



p = £/« 

i 

/? u = £ 



(1) 

(2) 



where r is the relaxation time. This can be shown, 
through a Chapman-Enskog expansion (see, e.g., [2"I |2"3"] 1 . 
to reproduce the Navier-Stokes equations with a kine- 
matic viscosity v given by 



u = c 2 s (r - At/2). 



(6) 



LBGK is simple to implement, gives excellent perfor- 
mance, and is therefore widely used. 

Multiple relaxation time (MKT) collision opera- 
tors, developed at the same time as LBGK [24] . general- 
ize the notion of relaxation towards local equilibrium by 
allowing different relaxation rates for different moments 
of the distributions, potentially improving stability prop- 
erties |25j . The eigenvalue of the collision matrix which 
corresponds to the relaxation of shear stress A s hcar deter- 
mines the viscosity as in ([6| (r — > 1/A s hcar)- We consider 
the MKT operator on the D3Q15 and D3Q19 lattices, as 
described by d'Humieres et al. [2"5] . 

For the equilibrium distribution, we use a second-order 
(in velocity space) approximation to a Maxwellian distri- 
bution 



/fHP> u ) = P w i 1 + 



Ci ■ u (a ■ u) 



2d 



2d 
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where the weights Wi and speed of sound c s depend on 
the choice of velocity set. 



Advancing the system one timestep is conceptually di- 
vided into two stages. The first is collision, which relaxes 
the distributions towards a local equilibrium (we denote 
the post-collisional distributions as /*): 



f*(r,t) = f i (v,t) + n(f i (r,t)), 



(3) 



where Q is a the collision operator. The second is stream- 
ing, where the post-collisional distributions are propa- 
gated along the lattice vectors to new locations in the 
lattice, defining the distributions of the next timestep: 



f i (r + c i At,t + At) = ft(v,t). 



A. Collision operators 



(4) 



The collision operator approximates the microscopic 
interparticle interactions. Here we summarize the two 
considered in this article: lattice Bhatnagar-Gross-Krook 
and multiple relaxation time. 

Lattice Bhatnagar-Gross-Krook (LBGK) [201 EH] 
approximates the collision process as relaxation towards 



B. No-slip boundary conditions 

Boundary conditions for LBM have some additional 
complications compared to boundary conditions for 
Navier-Stokes based methods, due to the LBM's meso- 
scopic nature. Typically, one wishes to impose conditions 
on the macroscopic, hydrodynamic variables (p, u) but 
these must be implemented through a closure relation 
for the mesoscopic distributions. There is no single, ob- 
viously superior choice. In this section, we will briefly re- 
view some commonly used boundary condition methods 
for lattice-Boltzmann models which impose the no-slip 
condition that the velocity of fluid adjacent to a wall is 
equal to the velocity of the wall. 

Many boundary condition methods do not vary their 
behavior with respect to the location of the walls in re- 
lation to the Eulerian grid. The wall is often assumed 
to pass infinitesimally close to a point, the grid point re- 
maining inside the fluid domain (sometimes referred to 
as "wet node" boundary conditions [2]), or alternatively 
the boundary is considered to be halfway along the lattice 
vector to a point outside the fluid. In the case of com- 
plex or non-lattice-aligned domains, these methods will 
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always cause a first-order modeling error, irrespective of 
the order of numerical accuracy of the resulting lattice- 
Boltzmann method [2B]. This point has been studied 
numerically by Stahl et al. j!7j. Other boundary condi- 
tions allow the wall to be at an arbitrary position along 
the link between a solid and a fluid site. These reduce 
the modeling error of fixed wall position methods, but 
often at the price of increased complexity and/or the re- 
quirement of data from neighboring fluid lattice points. 
Further, a number of methods suffer reduced accuracy at 
sites in corners, which can reduce the accuracy of simu- 
lations throughout the domain. 

Simple bounce-back (SBB) is perhaps the most 
widely used boundary condition for solid walls, posi- 
tioning them halfway along the lattice vector from fluid 
to solid. It is straightforward to implement and gives 
second-order accurate simulation results for flat bound- 
aries aligned with the Cartesian axes of the lattice [57] , 
although in more complex cases this degrades to first- 
order accuracy [28] . It is also computationally cheap and 
can be local in its operation. SBB ensures conservation 
of mass up to machine precision. In this work we use the 
halfway bounce-back scheme. 

There are a number of popular methods 29 - 32J that 
can only operate on straight, axis-aligned planes and 
force the boundary to pass directly through the lattice 
point. Malaspinas et al. [33J generalized the regular- 
ized method [31] to cope accurately with corner nodes, 
although the authors acknowledge that it fails for the 
case of a D3Q15 lattice and a right-angled corner. These 
methods are, therefore, unsuitable for problems involving 
complex boundaries. 

The Bouzidi-Firdaouss-Lallemand (BFL) 
method [34] starts with simple bounce-back, but 
interpolates the value of the to-be-propagated distri- 
bution with the distribution at the fluid site which 
standard bounce-back would stream it to. They present 
two variants: one using linear interpolation and another 
using quadratic interpolation. In the present work, we 
restrict our attention to the linear case, due to its locality 
and smaller communication requirement (indeed, it can 
be implemented without any inter-process communi- 
cation above the normal latticc-Boltzmann streaming 
step, albeit at a price of revisiting the sites adjacent to 
the wall); the authors claim that both variants show 
second-order convergence, but with the linear method 
having a poorer prefactor [34] , However inspection of 
Fig. 6 in [31] appears to show first-order convergence, at 
least for the largest resolution results shown. 

Guo, Zheng, and Shi (GZS) [16] present a bound- 
ary condition which decomposes the unknown distribu- 
tions at the wall into equilibrium and non-equilibrium 
parts. The equilibrium part uses the density of the fluid 
site and a linearly extrapolated velocity such that the 
velocity at the solid wall is as imposed. For the non- 
equilibrium part, the value from the fluid site (or the 
next site into the bulk) is used. 

Junk and Yang (JY) [35] propose a correction to 



the simple bounce-back method. They claim an advan- 
tage compared to interpolation-based methods such as 
GZS and BFL as their method is completely local and is 
able to handle non-straight boundaries where sites have 
lattice vectors in opposite directions, both crossing the 
solid-fluid boundary. The method arises from their anal- 
ysis |36) of boundary conditions for LBM in terms of 
general methods for studying finite difference schemes 
rather than the standard Chapman-Enskog expansion. 
By adding correction terms to the collision operator and 
then discretizing them in an optimal (under their anal- 
ysis framework) manner, they ensure the Navier-Stokcs 
equations are obeyed with the correct boundary condi- 
tions. 

We also note that some authors propose the use of 
grid-refinement |37j . finite difference [38] . and finite vol- 
ume [39] [40] discretizations of the discrete Boltzmann 
equation as methods for improving accuracy around non- 
planar boundaries, but we will restrict our attention here 
to implementations on a single, regular grid. Addition- 
ally, the immersed boundary method (IBM) [41] has been 
used in conjunction with the LBM to simulate rigid [H] 
and deformable [43) boundaries. IBM requires a further 
layer of fluid sites outside the walls as well as another, 
simpler boundary condition at the edge of the halo re- 
gion, but admits an extension to moving boundaries in a 
natural way. 

C. Open boundary conditions 

For hemodynamic applications, we must be able to 
specify inlet and outlet boundary conditions. If the ve- 
locity and pressure at the open boundary is known, then 
one can choose Dirichlct boundary conditions. However, 
if one wishes to use Dirichlet conditions at both inlets 
and outlets, the total mass in the system can increase or 
decrease unboundedly unless the mass fluxes into and out 
from the system balance exactly [H]. Alternatively, one 
could impose mixed Dirichlet-Neumann boundary condi- 
tions: 

p = Po, (8a) 
u,| = 0, (8b) 

n • V«i = 0, (8c) 

where n is the inward pointing normal of the open bound- 
ary. 

A number of authors [301 ESH3Z] have proposed open 
boundary conditions that fulfill some or all of these re- 
quirements, however the techniques cited are only suit- 
able for inlets aligned with the lattice's axes. We have 
therefore developed a simple method that meets these 
requirements. 

Assume that, at the start of a timestep, all distribu- 
tions for an inlet /outlet site are known. LBM then pro- 
ceeds as normal: a collision step, followed by a streaming 
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step; the distributions that would have come from exte- 
rior sites now have an undefined value. We close the sys- 
tem by constructing a "phantom site" (indicated below 
with a subscript V) beyond the boundary, whose hydro- 
dynamic variables are estimated based on the imposed 
values and those at the inlet/outlet site (indicated with 
subscript T). Note that there is one phantom site per 
missing distribution, i.e., the phantom sites of adjacent 
inlet /outlet sites are unrelated. This is to eliminate the 
need for communication between neighboring sites. The 
equilibrium distribution for the missing direction is com- 
puted at the phantom site and then streamed into place. 
For the density, we assume the target pressure po, i.e., we 
perform a zero-order extrapolation from the inlet plane. 
Condition ( 8b I is enforced by projecting away any ve- 



locity component not parallel to the inlet/outlet normal, 
n. For condition (8c), we take first-order finite-difference 



approximations for the derivatives, giving: 



n • Vm_l 



u±{r x ,t) - u ± (r v ,t) 
Ci a ■ nAt 



= 0. 



Hence 



u(r v ,t) « (u(r z ,i) • n)n. 



(9) 



(10) 



For lattice sites that are adjacent to both open and closed 
boundaries, SBB is applied to the distributions in the 
direction of the wall. In the data presented in Sec. Ill B 



we see that this boundary condition gives good agreement 
with the expected velocity fields, but in simulations at 
higher Reynolds number (data not shown) we observe 
larger errors, especially at these boundaries. This will be 
explored further in subsequent work. 



III. SIMULATIONS 

Our goal is to determine which combination of bound- 
ary condition, collision operator and velocity set gives 
the best all-round accuracy in a general, complex geom- 
etry, with a focus on computational hemodynamics. We 
also assess the computational requirements of the differ- 
ent models. 

We compare against analytical solutions, which re- 
stricts us to relatively simple domains: we choose a 
cylinder with both steady, pressure-driven flow (Hagen- 
Poiseuille flow) and with a time-dependent, sinusoidal 
pressure gradient (Womersley flow). By choosing a non- 
axis-aligned orientation for the cylinder, we better mimic 
a typical production simulation of the human vascula- 
ture. The orientation n was chosen pseudorandomly from 
the unit sphere, subject to the constraint that n e^ < 0.9, 
Vz. The value is 



n = [-0.299,0.382,0.874]. 



(11) 



Our approach is to select parameters in lattice units, but 
with physiologically relevant Reynolds (Re) and Womer- 
sley (a) numbers. 



A. Software: HemeLB 

The simulations in this paper were performed with 
HemeLB [IT] , a lattice-Boltzmann-based fluids solver, 
which includes capability for in situ imaging of flow- 
fields and real-time steering [35] . It is a distributed mem- 
ory application, parallelized with MPI. We have shown 
that HemeLB 's computational performance scales lin- 
early up to at least 32,768 cores [3H] ■ We have released 
the software online [TS], under the open-source GNU 
Lesser General Public License (LGPL), to enable inter- 
ested researchers to reproduce our results as well as to 
use the software for novel problems. 

HemeLB has several linked components, described in 
Groen et al. [49]. We have recently re-developed the 
lattice-Boltzmann core to allow for easy switching be- 
tween use of different velocity sets, collision operators, 
and boundary conditions, through a statically polymor- 
phic, object-oriented design. This avoids any run-time 
overhead due to dynamic polymorphism. The individ- 
ual software components are tested through a battery of 
over one hundred unit and regression tests, which are run 
nightly by our continuous integration server. 

HemeLB includes, amongst other features: the 
D3Q15 and D3Q19 velocity sets; the lattice Bhatnagar- 
Gross-Krook (LBGK) and multiple relaxation time 
(MRT) collision operators; and, the simple bounce- 
back (SBB), Guo-Zheng-Shi (GZS), Bouzidi-Firdaouss- 
Lallemand (BFL), and Junk- Yang (JY) boundary condi- 
tion methods. 

The software includes a separate tool for defining the 
simulation domain. This requires either a geometric 
primitive or a general surface, meshed with triangles. 
The user can then place inlets and outlets, specify their 
pressure conditions, and select the fineness of the lattice. 
This setup tool then generates the input for HemeLB 
itself, producing a description of each fluid site and, if 
needed, the location of the wall. 



B. Convergence analysis 

In this section we report on a series of simulations 
of Hagen-Poiseuille flow over a range of resolutions and 
Reynolds numbers, defined here as 



Re = U max D/u, 



(12) 



where U max is the maximum velocity, D the pipe diame- 
ter, and v the kinematic viscosity. The velocity U is 



U 



((D/2) 



(13) 



where r is the distance from the cylinder axis, defined by 
n, and Vp is the pressure gradient. 

In Table [I] we list all the parameters chosen for the 
simulations. The range of Re spans typical values for 
cerebral arteries in the human body [SO]. For each case 
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TABLE I. Parameters for convergence analysis. All values are 
given in lattice units. Parameters are, from left to right: the 
Reynolds number Re; the cylinder diameter D; the predicted 
Mach number Ma; the LBGK relaxation time r; the relative 
density difference imposed Ap/po; the momentum diffusion 
time T mom = D 2 /v, and the sound propagation time T s = 
L/c s . 



with the velocity field found by simulation, u(r,i). We 
define the velocity error as 



Re 


D 


Ma 


T 




Tmom 


T s 


1 


12 


0.0241 


1.00 


0.01850 


864 


42 


1 


24 


0.0120 


1.00 


0.00463 


3460 


83 


I 


48 


0060 


1.00 


001 1 6 


13800 


166 


1 


96 


0.0030 


1.00 


0.00029 


55300 


333 


1 


192 


0.0015 


1.00 


0.00007 


221000 


665 


30 


12 


0.3610 


0.75 


0.13900 


1730 


42 


30 


24 


0.1800 


0.75 


0.03470 


6910 


83 


30 


48 


0.0902 


0.75 


0.00868 


27600 


166 


30 


96 


0.0451 


0.75 


0.00217 


111000 


333 


30 


192 


0.0226 


0.75 


0.00054 


442000 


665 


100 


12 


0.4810 


0.60 


0.07410 


4320 


42 


100 


24 


0.2410 


0.60 


0.01850 


17300 


83 


100 


48 


0.1200 


0.60 


0.00463 


69100 


166 


100 


96 


0.0601 


0.60 


0.00116 


276000 


333 


100 


192 


0.0301 


0.60 


0.00029 


1110000 


665 



we vary the diameter D from 12 — 192 lattice units; the 
length of tube used is given by L — 2D. Due to the fi- 
nite speed of sound in LBM (c s = 1/ -\/3 for the models 
used here), we list the Mach numbers (Ma = ?7 max /c s ); 
the lowest resolution simulations in each have extremely 
high values and will consequently have poor accuracy, 
but this allows us to assess the convergence behavior at 
modest computational expense. Next we show the value 
of the LBGK relaxation time r which must be greater 
than 1/2 [T] and not be much greater than one |51j . We 
hold t constant in lattice units while refining the spatial 
resolution, which implies diffusive scaling of the timestep 
(when converted to physical units), i.e., At oc Ax 2 . The 
density, and hence pressure, difference Ap driving the 
flow is also reported; this must remain much less than 
the reference density of the simulation po- Finally, we 
list the time for momentum to diffuse across the cylin- 
der's diameter, T mom = D 2 /v, and the time for a sound 
wave to propagate the length of the cylinder, T s = L/c s , 
to give some idea of the time required for the simulation 
to converge to a steady state. To determine whether a 
simulation has indeed converged, we compute the maxi- 
mum difference between flow fields at two times 



Au(ii,i 2 



max r ||u(r,£i) - u(r,i 2 

Urn QV 



(14) 



and require that Au(t,t + 1) < 10 -7 . 

We use a simple initialization procedure, initializing to 
a uniform density fluid at rest. This approach is general 
and can be applied to any geometry without requiring 
any preprocessing step [52H54] . but does require longer 
simulation times until a steady solution is reached. For 
each simulation we compare the Poiseuille solution, U(r), 



u*(r,t) = u(r,t) -U(r) 



(15) 



and use the £ 2 -norm scaled by the predicted velocity 
range as our measure of error: 



Nm&Xr IIUK 



(16) 



These are evaluated over all fluid sites in the central 90% 
of the cylinder, thus excluding the inlet and outlet sites. 
For the pressure gradient, we use the measured differ- 
ence at the edge of this volume divided by the distance 
between the two planes. This is to disentangle the errors 
due to the open boundary condition method from the 
no-slip condition. We hope to return in the future to a 
full validation of the open boundary condition method. 

The simulations were performed on HECToR, the UK 
national supercomputer, using between 32 and 16384 
cores. The number of cores for each run was chosen to 
minimize the run time while remaining efficient which we 
have shown to occur at around 10 3 sites per core [3H] ; we 
take the integer power of two that is closest to this. 

In Fig. [I] we show the velocity errors as a function of 
increasing resolution for each of the four boundary con- 
dition methods and the two velocity sets. We use only 
LBGK as the collision operator. We note that many of 
the simulations using the Guo-Zheng-Shi and Junk- Yang 
boundary conditions became unstable during the simula- 
tion, with the lattice-Boltzmann distributions becoming 
negative. These methods only ran until a converged so- 
lution was obtained for the Re = 1 case. 

The Junk- Yang method with a D3Q15 velocity set 
showed particularly poor stability, only converging for 
the lowest resolution case and to a poor approximation 
of the expected flow. Our unit test suite demonstrates 
(data not shown) that the JY implementation reduces to 
SBB when the wall is a plane normal to one of the axes 
and halfway between the site and its non-fluid neighbors, 
as expected [35] ■ In the D3Q19 case, JY gives better ac- 
curacy than SBB but poorer than BFL. The GZS method 
on D3Q19 also had poor stability, failing for nearly all the 
simulations, while GZS with D3Q15 gave good results for 
the lower resolution Re = 1 simulations. 

Simple bounce-back and Bouzidi-Firdaouss-Lallemand 
were both reliable, remaining stable at all the reported 
Reynolds number and resolutions, except for BFL with 
D3Q15 at the highest resolutions. These both show first- 
order convergence to the analytical solution as the space 
step is reduced, as expected. BFL shows better agree- 
ment with the solutions than SBB, having errors around 
one third lower across all simulations. As the Reynolds 
number is increased, we note that the errors also in- 
crease, particularly when going from Re = 30 to 100. 
This may be due to the combination of the increase in 
Mach number and the decrease in r towards the lower 
limit of 0.5 EH- 
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FIG. 1. (Color online.) Convergence of velocity error residuals (e„) as defined in Eq. (161 vs. diameter in lattice units, (a) 
Reynolds number Re = 1; (b) Re = 30; (c) Re = 100. Colors (black, red, blue, cyan) indicate boundary condition (SBB, BFL, 
GZS, JY), triangles indicate D3Q15 and crosses D3Q19. The solid lines are guides to the eye showing first-order convergence. 
Missing symbols are due to simulations becoming unstable for that combination of LBM and parameters. 



C. Womersley flow 

Here we report on simulations of oscillatory flow, in 
order to explore the different boundary condition meth- 
ods' effect on accuracy for time-dependent cases. The 
Womersley number (a) is a dimensionless number gov- 
erning dynamical similarity in cases of oscillatory flow. 
It relates to the ratio of transient forces to viscous forces 
(or alternatively the ratio of diameter to boundary layer 
growth during one period of oscillation) and is defined 
as: 



(17) 



where uj is the characteristic angular frequency of the 
pressure oscillation and L is a characteristic length scale. 

In the case of a cylinder (radius R, axis n) and lam- 
inar flow with zero average pressure gradient (Vp(i) = 
(A/L) sinwt, uj = 2tt/T osc ), the time-dependent Navier- 
Stokes equations admit an analytic solution [55) : 



u(r,t) = -5R 




J (i 



3/2, 



l) 



1 J„{i 3 / 2 a) ' ' 



n, (18) 



where Jo is the order-0 Bessel function of the first kind 
and a is defined using the following characteristic quan- 
tities: cylinder radius, the pressure oscillation angular 
frequency, and the fluid viscosity. 

In the larger arteries of the human body, Womersley 
numbers range approximately between 4 and 20 |50) . so 
we select from this range. For these simulations, we de- 
fine the Reynolds number as that for a Poiseuille flow 
with a pressure gradient given by the amplitude of the im- 
posed gradient. We note that in the body, the Reynolds 
number scales approximately with a, so we select param- 
eters only from a single curve within the Re — a plane, as 



TABLE II. Parameters for Womersley flow simulations. All 
values are given in lattice units. Parameters are: Reynolds 
number (defined for a Poiseuille flow with the maximum pres- 
sure gradient); Womersley number; predicted Mach number 
(based on the maximum Poiseuille flow); the LBGK relax- 
ation time; the maximum relative density difference; the pe- 
riod of the pressure oscillation, and the momentum diffusion 
time. 



Re 


a 


Ma 


T 


Ap/po 






T s 


30 


4 


0.043 


0.620 


0.00200 


5655 


57600 


166 


100 


8 


0.078 


0.565 


0.00194 


2618 


107000 


166 


300 


12 


0.113 


0.531 


0.00135 


2417 


222000 


166 



shown in Table [LT) We use the same cylinder orientation 
as in Sec. 



Ill B| and select a diameter D = 48 Ax, as this 
gives a reasonable balance between computational cost 
and accuracy, such as would be chosen for production 
simulations; we keep L = 2D. We initialize the sim- 
ulation to a constant pressure with zero velocity (with 
the phase of the pressure oscillation chosen such that the 
driving difference is zero at simulation start). We al- 
low the simulation to run until Au(t, t — T osc ) < 10 -7 is 
reached for all sample points during one oscillation; how- 
ever, to reduce the amount of data collected, we record 
data only for those points within one lattice unit of an 
axis-normal plane halfway along the cylinder. 

We run these simulations for all combinations of LBM 
components and compute residuals at four sample points 
during one pressure oscillation period using Eq. ( 16 ). The 



four residuals are then reduced by taking the root-mean- 
square average and the maximum respectively, effectively 
extending the averaging/maximization over time as well 
as space. 

We found that all the simulations with the Guo-Zheng- 
Shi and Junk- Yang boundary conditions became unsta- 
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TABLE III. Error residuals for pulsatile flow simulations with 
different LBM components. On the left are the boundary con- 
dition method (BC); collision operator (CO); and velocity set 
(DmQn). On the right are the error residuals e„ for differ- 
ent values of Womersley number a (other parameters are as 
shown in Table |Tl|. 



BC 


CO 


DmQn 


a = 4 


a = 8 


a = 12 


SBB 


LBGK 


D3Q15 


0.025 


0.043 


0.061 


BFL 


LBGK 


D3Q15 


0.018 


0.027 


0.041 


SBB 


LBGK 


D3Q19 


0.026 


0.043 


0.060 


BFL 


LBGK 


D3Q19 


0.020 


0.028 


0.040 


SBB 


MRT 


D3Q15 


0.024 


0.043 


0.065 


BFL 


MRT 


D3Q15 


0.018 


0.027 


0.041 


SBB 


MRT 


D3Q19 


0.027 


0.046 


0.069 


BFL 


MRT 


D3Q19 


0.020 


0.028 


0.041 



TABLE IV. Performance per core for different combinations 
of boundary condition, collision operator and velocity set. 
The columns indicate the combination of collision operator 
and velocity set, while the rows indicate the boundary condi- 
tion. Unstable, and hence aborted, simulations are indicated 
with a dash (-). The values are given in millions of site up- 
dates per second (MSUPS) followed by the time to perform 
one boundary condition site update, relative to SBB, in paren- 
theses. 

LBGK MRT 
D3Q15 D3Q19 D3Q15 D3Q19 

~SBB L86 (I) L59 (I) 067 (1) 0A5 (Tf 
BFL 1.70 (2.0) 1.46 (2.0) 0.61 (2.1) 0.42 (1.8) 
GZS 1.40 (4.7) - 0.60 (2.3) 0.40 (2.4) 

JY 0.54 (28) - 0.35 (11) 




FIG. 2. (Color online.) Error residuals e„ as defined in 



Eq. ( 16 1 for simulations of pulsatile flow vs. Womersley num- 



ber a for the simple bounce-back (SBB, black) and Bouzidi- 
Firdaouss-Lallemand (BFL, red) boundary conditions. Bars 
show the range of error residuals measured across all combi- 
nations of collision operator and velocity set. 



ble before the end of the first period, as might have been 
expected since they failed for all the Poiseuillc simula- 
tions with Re > 30 in Sec. |IIIB| The results of the sim- 
ple bounce-back and Bouzidi-Firdaouss-Lallemand simu- 
lations are shown in Table [TTTl We see that all simulations 
well approximate the analytical solutions, with errors in 
the range 1.8-6.9%. 

In Fig. [2j we show the range of error residuals for SBB 
and BFL against Womersley number. This clearly shows 
the the BFL method offers superior accuracy to SBB in 
all cases, irrespective of the choice of collision operator 
and velocity set. Choosing MRT over LBGK does not 
offer any benefits, however we have not varied the re- 
laxation rates for the non-stress tensor moments of the 
distributions (e.g., by projecting out all the "ghost" ki- 
netic modes every timestep). We note again that we have 
adopted the parameters from [25] without optimization. 



The choice of velocity set does not make a significant 
difference to the accuracy. 



D. Relative performance 

The relative performance of the options is germane 
to the choice of which combination of lattice-Boltzmann 
components to use. In Table IV we give the measured 



per-core performance for the lowest Re/a pulsatile flow 
simulations. All these runs were performed on HEC- 
ToR, a Cray XE6 supercomputer with two 16-core AVID 
Opteron 2.3 GHz Interlagos processors per node. The 
simulations used 256 cores. The performance is given in 
millions of site updates per second (MSUPS) and is based 
on timings that include only the lattice-Boltzmann up- 
dates. We also estimate the relative time to perform one 
site update for the different methods, assuming that an 
SBB site update takes the same time as a bulk fluid site 
update. The simulation domain includes 173,706 sites of 
which 9% are at the solid-fluid boundary and hence use 
the various boundary condition methods. 

We do not place much emphasis on these results as 
we have not significantly optimized the implementation 
of boundary conditions (other than SBB) and the MRT 
collision operator is implemented naively. Nonetheless, 
simple bounce-back gives the best computational perfor- 
mance in all cases while BFL only needs approximately 
twice the compute time across the different simulations. 
The Junk- Yang implementation requires a number of lin- 
ear algebra operations at every timestep (implemented in 
HemeLB with the Boost uBLAS library), explaining its 
relatively poor performance. 



IV. CONCLUSION 

The majority of benchmark problems reported in the 
lattice-Boltzmann literature use lattice-aligned geome- 
tries, rather than the complex domains required by many 
applications. We have performed a comparison between 



LBM simulation solutions, from our open source appli- 
cation HemeLB [18] . and analytical solutions in a non- 
lattice-aligned, curved domain up to Reynolds numbers 
of 100, with steady and unsteady flow. We have varied 
the resolution of the grid used and the different compo- 
nents of the algorithm (collision operator, velocity set, 
no-slip boundary condition). We find that at these mod- 
erate values of Re, the choice of velocity set and collision 
operator do not greatly affect accuracy or stability, but 
that the choice of no-slip boundary condition method is 
critical. 

The Guo-Zheng-Shi and Junk- Yang methods, while of- 
fering good accuracy for steady flow at low Reynolds 
number, show poor stability at larger Re. They are 
therefore unsuitable for our hemodynamic applications or 
other applications that require even moderate Reynolds 
numbers. 

We find that simple bounce-back (SBB) and the 
Bouzidi-Firdaouss-Lallemand (BFL) methods both show 
first-order convergence over a wide range of resolutions 
and Reynolds numbers, but with SBB having errors 
about 50% larger. The BFL algorithm is a factor of two 
slower than SBB but, since it is only used at the bound- 
aries, it will cause an increase in CPU time by a factor of 
< 1.1. Reaching equivalent accuracy with SBB requires 



enhancing the resolution by a factor of ~ 1.5; due to 
the diffusive scaling of the timestep, this increases CPU 
time by a factor of ~ 1.5 5 ps 7.5, as well as increasing 
total memory consumption by a factor of ~ 1.5 3 ~ 3. 
We therefore recommend BFL in preference to SBB. Al- 
though SBB is less accurate, it is still acceptable for use 
in less demanding applications and when development 
time is at a premium. We hope that these results will 
prove helpful to the community when selecting methods 
for simulating hemodynamics and comparable applica- 
tions with latticc-Boltzmann methods. 
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